An Efficient Agglomerative Clustering Algorithm for Web Navigation Pattern Identification

Author

  • A. Anitha
Abstract

Web log mining is the analysis of web log files that record sequences of page visits. Discovering user access patterns from web access logs is necessary for building adaptive web servers, improving e-commerce, carrying out cross-marketing, personalizing web content, and predicting web access sequences. In this paper, a new agglomerative clustering technique is proposed to identify users with similar interests and to determine their motivation for visiting a website. With this approach, web usage mining proceeds through four stages: data cleaning, preprocessing, pattern discovery, and pattern analysis. Results are given to show that this approach produces tighter usage clusters than existing web usage mining techniques. Rather than a traditional distance-based criterion, a similarity measure is used during the clustering process in order to reduce computational complexity. This paper also deals with the problem of assessing the quality of user session clusters: cluster validity is measured with a statistical test that compares the distances between cluster distributions to infer their dissimilarity and level of distinction. Using such statistical measures, cluster accuracy is shown to improve to a validity measure of 0.83, compared with 0.26 for existing k-means clustering, 0.56 for FCM (Fuzzy C-Means) clustering, and 0.54 for rough-set-based clustering. Generation of dense clusters is essential for finding the interesting patterns needed for further mining and analysis.
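The abstract does not say which similarity measure or linkage rule the paper uses, so the following is only a minimal sketch of similarity-driven agglomerative clustering over user sessions. It assumes each session is a set of visited page paths, uses Jaccard similarity with average linkage, and stops merging at a hypothetical `threshold`; the function names and sample sessions are illustrative and not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of visited pages (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_similarity(c1, c2, sessions):
    """Average pairwise similarity between the members of two clusters."""
    total = sum(jaccard(sessions[i], sessions[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(sessions, threshold=0.5):
    """Repeatedly merge the most similar pair of clusters.

    Merging stops when no pair reaches `threshold` (a hypothetical cutoff).
    Returns clusters as lists of session indices.
    """
    clusters = [[i] for i in range(len(sessions))]
    while len(clusters) > 1:
        best_pair, best_sim = None, threshold
        for x, y in combinations(range(len(clusters)), 2):
            sim = cluster_similarity(clusters[x], clusters[y], sessions)
            if sim >= best_sim:
                best_pair, best_sim = (x, y), sim
        if best_pair is None:
            break  # no remaining pair is similar enough to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return clusters


# Illustrative sessions, not data from the paper.
sessions = [
    {"/home", "/products", "/cart"},
    {"/home", "/products", "/checkout"},
    {"/blog", "/about"},
    {"/blog", "/about", "/contact"},
]
result = agglomerate(sessions, threshold=0.4)
print(result)
```

Working with similarities directly means sessions never need to be embedded in a vector space before clustering; whether that matches the complexity reduction claimed in the abstract depends on details the abstract does not give.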

Similar resources

An Improved Web Log Mining and Online Navigational Pattern Prediction

The aim of this study is to improve web log mining and online navigation pattern prediction. Web mining is an active and wide area with several uses, including web site design, personalization servers, and business decision making. Efficient web log mining and online navigational pattern prediction are difficult because of the vast growth of the web. It inc...

Modified Agglomerative Clustering for Web Users Navigation Behavior

Clustering is the task of finding natural partitioning within a data set such that data items within the same group are more similar than those within different groups. At present all available clustering techniques for web usage mining are ba...

Agglomerative Clustering in Web Usage Mining: A Survey

Web usage mining is used to extract knowledge from the WWW. As user interaction with web data grows, web usage mining is significant for effective website management, adaptive website creation, support services, personalization, network traffic flow analysis, and user trend analysis; user profiles also help to promote a website in rankings. Agglomerative clustering is a most flexi...

Implementation of Hybrid Clustering Algorithm with Enhanced K-Means and Hierarchal Clustering

We propose a hybrid clustering method that combines the strengths of both partitioning and agglomerative clustering methods. Clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for their interactive visualization and exploration as they provide data-views that are consistent, predictable, and at different levels of granularit...

User Navigation Pattern Discovery using Fast Adaptive Neuro-Fuzzy Inference System

The World Wide Web is a huge repository of web pages and links. It provides abundant information for Internet users. The growth of the web is incredible, as can be seen today. Users' accesses are recorded in web logs. From the user's perspective, it is very difficult to extract useful knowledge from the huge amount of information and secondly, it is also difficult to extract for the us...

Publication date: 2016